1. Objective

2. Overview


3. Data Description

3.2 Inspecting the Dataset

Summary

##     Zip.Code         Customer.ID      Gender          Age        Married   
##  Min.   :90001   0002-ORFBO:   1   Female:3488   Min.   :19.00   No :3641  
##  1st Qu.:92101   0003-MKNFE:   1   Male  :3555   1st Qu.:32.00   Yes:3402  
##  Median :93518   0004-TLHLJ:   1                 Median :46.00             
##  Mean   :93486   0011-IGKFF:   1                 Mean   :46.51             
##  3rd Qu.:95329   0013-EXCHZ:   1                 3rd Qu.:60.00             
##  Max.   :96150   0013-MHZWF:   1                 Max.   :80.00             
##                  (Other)   :7037                                           
##  Number.of.Dependents            City         Latitude       Longitude     
##  Min.   :0.0000       Los Angeles  : 293   Min.   :32.56   Min.   :-124.3  
##  1st Qu.:0.0000       San Diego    : 285   1st Qu.:33.99   1st Qu.:-121.8  
##  Median :0.0000       San Jose     : 112   Median :36.21   Median :-119.6  
##  Mean   :0.4687       Sacramento   : 108   Mean   :36.20   Mean   :-119.8  
##  3rd Qu.:0.0000       San Francisco: 104   3rd Qu.:38.16   3rd Qu.:-118.0  
##  Max.   :9.0000       Fresno       :  61   Max.   :41.96   Max.   :-114.2  
##                       (Other)      :6080                                   
##  Number.of.Referrals Tenure.in.Months     Offer      Phone.Service
##  Min.   : 0.000      Min.   : 1.00    None   :3877   No : 682     
##  1st Qu.: 0.000      1st Qu.: 9.00    Offer A: 520   Yes:6361     
##  Median : 0.000      Median :29.00    Offer B: 824                
##  Mean   : 1.952      Mean   :32.39    Offer C: 415                
##  3rd Qu.: 3.000      3rd Qu.:55.00    Offer D: 602                
##  Max.   :11.000      Max.   :72.00    Offer E: 805                
##                                                                   
##  Avg.Monthly.Long.Distance.Charges Multiple.Lines Internet.Service
##  Min.   : 1.01                        : 682       No :1526        
##  1st Qu.:13.05                     No :3390       Yes:5517        
##  Median :25.69                     Yes:2971                       
##  Mean   :25.42                                                    
##  3rd Qu.:37.68                                                    
##  Max.   :49.99                                                    
##  NA's   :682                                                      
##      Internet.Type  Avg.Monthly.GB.Download Online.Security Online.Backup
##             :1526   Min.   : 2.00              :1526           :1526     
##  Cable      : 830   1st Qu.:13.00           No :3498        No :3088     
##  DSL        :1652   Median :21.00           Yes:2019        Yes:2429     
##  Fiber Optic:3035   Mean   :26.19                                        
##                     3rd Qu.:30.00                                        
##                     Max.   :85.00                                        
##                     NA's   :1526                                         
##  Device.Protection.Plan Premium.Tech.Support Streaming.TV Streaming.Movies
##     :1526                  :1526                :1526        :1526        
##  No :3095               No :3473             No :2810     No :2785        
##  Yes:2422               Yes:2044             Yes:2707     Yes:2732        
##                                                                           
##                                                                           
##                                                                           
##                                                                           
##  Streaming.Music Unlimited.Data           Contract    Paperless.Billing
##     :1526           :1526       Month-to-Month:3610   No :2872         
##  No :3029        No : 772       One Year      :1550   Yes:4171         
##  Yes:2488        Yes:4745       Two Year      :1883                    
##                                                                        
##                                                                        
##                                                                        
##                                                                        
##          Payment.Method Monthly.Charge   Total.Charges    Total.Refunds   
##  Bank Withdrawal:3909   Min.   :-10.00   Min.   :  18.8   Min.   : 0.000  
##  Credit Card    :2749   1st Qu.: 30.40   1st Qu.: 400.1   1st Qu.: 0.000  
##  Mailed Check   : 385   Median : 70.05   Median :1394.5   Median : 0.000  
##                         Mean   : 63.60   Mean   :2280.4   Mean   : 1.962  
##                         3rd Qu.: 89.75   3rd Qu.:3786.6   3rd Qu.: 0.000  
##                         Max.   :118.75   Max.   :8684.8   Max.   :49.790  
##                                                                           
##  Total.Extra.Data.Charges Total.Long.Distance.Charges Total.Revenue     
##  Min.   :  0.000          Min.   :   0.00             Min.   :   21.36  
##  1st Qu.:  0.000          1st Qu.:  70.55             1st Qu.:  605.61  
##  Median :  0.000          Median : 401.44             Median : 2108.64  
##  Mean   :  6.861          Mean   : 749.10             Mean   : 3034.38  
##  3rd Qu.:  0.000          3rd Qu.:1191.10             3rd Qu.: 4801.15  
##  Max.   :150.000          Max.   :3564.72             Max.   :11979.34  
##                                                                         
##  Customer.Status         Churn.Category                        Churn.Reason 
##  Churned:1869                   :5174                                :5174  
##  Joined : 454    Attitude       : 314   Competitor had better devices: 313  
##  Stayed :4720    Competitor     : 841   Competitor made better offer : 311  
##                  Dissatisfaction: 321   Attitude of support person   : 220  
##                  Other          : 182   Don't know                   : 130  
##                  Price          : 211   Competitor offered more data : 117  
##                                         (Other)                      : 778  
##    Population    
##  Min.   :    11  
##  1st Qu.:  2344  
##  Median : 17554  
##  Mean   : 22140  
##  3rd Qu.: 36125  
##  Max.   :105285  
## 

Structure

## 'data.frame':    7043 obs. of  39 variables:
##  $ Zip.Code                         : int  90001 90001 90001 90001 90002 90002 90002 90002 90003 90003 ...
##  $ Customer.ID                      : Factor w/ 7043 levels "0002-ORFBO","0003-MKNFE",..: 5376 2308 86 2253 4701 3963 3129 3634 38 6795 ...
##  $ Gender                           : Factor w/ 2 levels "Female","Male": 1 2 2 1 2 2 2 2 1 1 ...
##  $ Age                              : int  36 36 71 66 31 46 56 68 60 49 ...
##  $ Married                          : Factor w/ 2 levels "No","Yes": 2 2 2 1 2 1 1 1 2 2 ...
##  $ Number.of.Dependents             : int  0 0 0 0 0 0 0 0 0 3 ...
##  $ City                             : Factor w/ 1106 levels "Acampo","Acton",..: 555 555 555 555 555 555 555 555 555 555 ...
##  $ Latitude                         : num  34 34 34 34 33.9 ...
##  $ Longitude                        : num  -118 -118 -118 -118 -118 ...
##  $ Number.of.Referrals              : int  0 1 1 0 9 0 0 0 4 2 ...
##  $ Tenure.in.Months                 : int  1 17 69 8 58 34 13 38 59 3 ...
##  $ Offer                            : Factor w/ 6 levels "None","Offer A",..: 6 1 1 6 1 1 1 1 3 1 ...
##  $ Phone.Service                    : Factor w/ 2 levels "No","Yes": 1 1 2 2 2 2 2 2 2 2 ...
##  $ Avg.Monthly.Long.Distance.Charges: num  NA NA 18.41 5.21 18.68 ...
##  $ Multiple.Lines                   : Factor w/ 3 levels "","No","Yes": 1 1 3 2 2 2 2 3 3 2 ...
##  $ Internet.Service                 : Factor w/ 2 levels "No","Yes": 2 2 2 2 1 2 1 2 2 2 ...
##  $ Internet.Type                    : Factor w/ 4 levels "","Cable","DSL",..: 3 2 4 4 1 3 1 4 4 4 ...
##  $ Avg.Monthly.GB.Download          : int  10 10 17 8 NA 16 NA 13 14 22 ...
##  $ Online.Security                  : Factor w/ 3 levels "","No","Yes": 2 3 2 2 1 3 1 2 3 2 ...
##  $ Online.Backup                    : Factor w/ 3 levels "","No","Yes": 3 2 3 2 1 2 1 2 3 3 ...
##  $ Device.Protection.Plan           : Factor w/ 3 levels "","No","Yes": 2 3 3 3 1 3 1 2 2 2 ...
##  $ Premium.Tech.Support             : Factor w/ 3 levels "","No","Yes": 2 2 3 2 1 2 1 2 2 3 ...
##  $ Streaming.TV                     : Factor w/ 3 levels "","No","Yes": 2 2 3 3 1 2 1 3 3 2 ...
##  $ Streaming.Movies                 : Factor w/ 3 levels "","No","Yes": 2 2 3 3 1 2 1 3 2 2 ...
##  $ Streaming.Music                  : Factor w/ 3 levels "","No","Yes": 2 2 2 2 1 2 1 2 2 2 ...
##  $ Unlimited.Data                   : Factor w/ 3 levels "","No","Yes": 3 3 3 3 1 3 1 3 3 3 ...
##  $ Contract                         : Factor w/ 3 levels "Month-to-Month",..: 1 1 3 1 3 2 1 3 1 1 ...
##  $ Paperless.Billing                : Factor w/ 2 levels "No","Yes": 2 1 2 2 1 1 2 1 2 2 ...
##  $ Payment.Method                   : Factor w/ 3 levels "Bank Withdrawal",..: 1 3 1 2 2 3 1 1 1 1 ...
##  $ Monthly.Charge                   : num  29.9 34.4 110 94.5 -4 ...
##  $ Total.Charges                    : num  29.9 592.8 7634.2 743 1186 ...
##  $ Total.Refunds                    : num  0 0 0 0 0 ...
##  $ Total.Extra.Data.Charges         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Total.Long.Distance.Charges      : num  0 0 1270.3 41.7 1083.4 ...
##  $ Total.Revenue                    : num  29.9 592.8 8904.5 784.6 2269.4 ...
##  $ Customer.Status                  : Factor w/ 3 levels "Churned","Joined",..: 2 3 3 1 3 3 3 3 3 2 ...
##  $ Churn.Category                   : Factor w/ 6 levels "","Attitude",..: 1 1 1 3 1 1 1 1 1 1 ...
##  $ Churn.Reason                     : Factor w/ 21 levels "","Attitude of service provider",..: 1 1 1 4 1 1 1 1 1 1 ...
##  $ Population                       : int  54492 54492 54492 54492 44586 44586 44586 44586 58198 58198 ...
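
In the summary above, the 682 NA values in Avg.Monthly.Long.Distance.Charges match the 682 blank Multiple.Lines entries and the 682 customers without phone service, and the 1526 NA values in Avg.Monthly.GB.Download match the 1526 blanks in the internet add-on columns. A minimal sketch of how such counts could be tallied is shown here; the `toy` data frame and its values are made up for illustration and only stand in for the merged customer data.

```r
# Toy data frame standing in for the merged customer data (values are invented).
toy <- data.frame(
  Phone.Service  = factor(c("No", "Yes", "Yes", "No")),
  Multiple.Lines = factor(c("", "No", "Yes", "")),
  Avg.Monthly.Long.Distance.Charges = c(NA, 18.41, 5.21, NA)
)

# Number of NA values in each column.
na_counts <- colSums(is.na(toy))

# Number of blank-string entries in each column (NA entries are not counted).
blank_counts <- sapply(toy, function(col) {
  sum(!is.na(col) & as.character(col) == "")
})

print(na_counts)
print(blank_counts)
```

Comparing the two vectors column by column makes it easy to see whether blanks and NAs come from the same group of customers before deciding how to fill them.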

4. Map View

Map view of all observations

Count of users in each ZIP code

Percentage of users churned in each ZIP code
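
The per-ZIP figures behind maps like these can be sketched in base R as follows. This is an illustration under assumptions, not the code used for the maps: the `toy` data frame is hypothetical, and only the column names `Zip.Code` and `Customer.Status` are taken from the dataset.

```r
# Hypothetical sample: six customers in two ZIP codes.
toy <- data.frame(
  Zip.Code = c(90001, 90001, 90001, 90001, 90002, 90002),
  Customer.Status = c("Joined", "Stayed", "Stayed", "Churned", "Stayed", "Churned")
)

# Number of customers in each ZIP code.
users_per_zip <- table(toy$Zip.Code)

# Share of customers with status "Churned" in each ZIP code, in percent.
churn_pct <- tapply(toy$Customer.Status == "Churned", toy$Zip.Code, mean) * 100

# Both results are ordered by ZIP code, so they can be combined directly.
zip_summary <- data.frame(
  Zip.Code  = names(churn_pct),
  Users     = as.vector(users_per_zip),
  Churn.Pct = as.vector(churn_pct)
)
print(zip_summary)
```

A table of this shape can then be joined to the latitude and longitude of each ZIP code for plotting.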

5. Data Cleaning and Manipulation

mergedf3 <- mergedf
mergedf3[is.na(mergedf3)] <- 0 # Replace NA values with 0 (they occur in Avg.Monthly.Long.Distance.Charges and Avg.Monthly.GB.Download)
mergedf3$Multiple.Lines[mergedf3$Multiple.Lines == ""] <- "No" # Replace blank values with "No"
transformedf <- mergedf3
transformedf$Offer <- recode_factor(transformedf$Offer, "None" = 0, "Offer A" = 1, "Offer B" = 2, "Offer C" = 3, "Offer D" = 4, "Offer E" = 5)
transformedf$Phone.Service <- recode_factor(transformedf$Phone.Service, No = 0, Yes = 1)
transformedf$Multiple.Lines <- recode_factor(transformedf$Multiple.Lines, No = 0, Yes = 1)
transformedf$Internet.Service <- recode_factor(transformedf$Internet.Service, No = 0, Yes = 1)
transformedf$Internet.Type <- recode(transformedf$Internet.Type, "Cable" = 1, "DSL" = 2, "Fiber Optic" = 3)
transformedf$Internet.Type <- recode_factor(transformedf$Internet.Type, "Cable" = 1, "DSL" = 2, "Fiber Optic" = 3, .missing = 0)
transformedf$Online.Security <- recode(transformedf$Online.Security, No = 0, Yes = 1) # Unmatched values (the blanks) are converted to NA
transformedf$Online.Security <- recode_factor(transformedf$Online.Security, No = 0, Yes = 1, .missing = 0) # NAs are recoded as 0
transformedf$Online.Backup <- recode(transformedf$Online.Backup, No = 0, Yes = 1)
transformedf$Online.Backup <- recode_factor(transformedf$Online.Backup, No = 0, Yes = 1, .missing = 0)
transformedf$Device.Protection.Plan <- recode(transformedf$Device.Protection.Plan, No = 0, Yes = 1)
transformedf$Device.Protection.Plan <- recode_factor(transformedf$Device.Protection.Plan, No = 0, Yes = 1, .missing = 0)
transformedf$Premium.Tech.Support <- recode(transformedf$Premium.Tech.Support, No = 0, Yes = 1)
transformedf$Premium.Tech.Support <- recode_factor(transformedf$Premium.Tech.Support, No = 0, Yes = 1, .missing = 0)
transformedf$Streaming.TV <- recode(transformedf$Streaming.TV, No = 0, Yes = 1)
transformedf$Streaming.TV <- recode_factor(transformedf$Streaming.TV, No = 0, Yes = 1, .missing = 0)
transformedf$Streaming.Movies <- recode(transformedf$Streaming.Movies, No = 0, Yes = 1)
transformedf$Streaming.Movies <- recode_factor(transformedf$Streaming.Movies, No = 0, Yes = 1, .missing = 0)
transformedf$Streaming.Music <- recode(transformedf$Streaming.Music, No = 0, Yes = 1)
transformedf$Streaming.Music <- recode_factor(transformedf$Streaming.Music, No = 0, Yes = 1, .missing = 0)
transformedf$Unlimited.Data <- recode(transformedf$Unlimited.Data, No = 0, Yes = 1)
transformedf$Unlimited.Data <- recode_factor(transformedf$Unlimited.Data, No = 0, Yes = 1, .missing = 0)
transformedf$Contract <- recode_factor(transformedf$Contract, "Month-to-Month" = 1, "One Year" = 2, "Two Year" = 3)
transformedf$Paperless.Billing <- recode_factor(transformedf$Paperless.Billing, No = 0, Yes = 1)
transformedf$Payment.Method <- recode_factor(transformedf$Payment.Method, "Bank Withdrawal" = 1, "Credit Card" = 2, "Mailed Check" = 3)
transformedf$Churn.Category <- recode(transformedf$Churn.Category, "Attitude" = 1, "Competitor" = 2, "Dissatisfaction" = 3, "Price" = 4, "Other" = 5)
transformedf$Churn.Category <- recode_factor(transformedf$Churn.Category, "Attitude" = 1, "Competitor" = 2, "Dissatisfaction" = 3, "Price" = 4, "Other" = 5, .missing = 0)
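
The `recode()` / `recode_factor()` pair above is repeated for each of the Yes/No service columns. As a possible simplification, the same 0/1 coding could be applied to several columns at once. The sketch below uses base R rather than dplyr, so it is an alternative and not the code that produced the output in this report; `recode_yes_no` and the `toy` data frame are hypothetical names introduced for illustration.

```r
# Hypothetical helper: map "Yes" to "1" and everything else
# ("No", blank strings, NA) to "0", returning a two-level factor.
recode_yes_no <- function(x) {
  is_yes <- !is.na(x) & as.character(x) == "Yes"
  factor(ifelse(is_yes, "1", "0"), levels = c("0", "1"))
}

# Toy data with blanks, mimicking customers without internet service.
toy <- data.frame(
  Online.Security = factor(c("No", "Yes", "", "No")),
  Streaming.TV    = factor(c("No", "", "Yes", "Yes"))
)

# Apply the helper to every listed column in one step.
service_cols <- c("Online.Security", "Streaming.TV")
toy[service_cols] <- lapply(toy[service_cols], recode_yes_no)

str(toy)
```

Listing the columns in one vector keeps the treatment of blanks identical across all service variables and avoids copy-and-paste slips.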

Summary

##     Zip.Code         Customer.ID      Gender          Age        Married   
##  Min.   :90001   0002-ORFBO:   1   Female:3488   Min.   :19.00   No :3641  
##  1st Qu.:92101   0003-MKNFE:   1   Male  :3555   1st Qu.:32.00   Yes:3402  
##  Median :93518   0004-TLHLJ:   1                 Median :46.00             
##  Mean   :93486   0011-IGKFF:   1                 Mean   :46.51             
##  3rd Qu.:95329   0013-EXCHZ:   1                 3rd Qu.:60.00             
##  Max.   :96150   0013-MHZWF:   1                 Max.   :80.00             
##                  (Other)   :7037                                           
##  Number.of.Dependents            City         Latitude       Longitude     
##  Min.   :0.0000       Los Angeles  : 293   Min.   :32.56   Min.   :-124.3  
##  1st Qu.:0.0000       San Diego    : 285   1st Qu.:33.99   1st Qu.:-121.8  
##  Median :0.0000       San Jose     : 112   Median :36.21   Median :-119.6  
##  Mean   :0.4687       Sacramento   : 108   Mean   :36.20   Mean   :-119.8  
##  3rd Qu.:0.0000       San Francisco: 104   3rd Qu.:38.16   3rd Qu.:-118.0  
##  Max.   :9.0000       Fresno       :  61   Max.   :41.96   Max.   :-114.2  
##                       (Other)      :6080                                   
##  Number.of.Referrals Tenure.in.Months Offer    Phone.Service
##  Min.   : 0.000      Min.   : 1.00    0:3877   0: 682       
##  1st Qu.: 0.000      1st Qu.: 9.00    1: 520   1:6361       
##  Median : 0.000      Median :29.00    2: 824                
##  Mean   : 1.952      Mean   :32.39    3: 415                
##  3rd Qu.: 3.000      3rd Qu.:55.00    4: 602                
##  Max.   :11.000      Max.   :72.00    5: 805                
##                                                             
##  Avg.Monthly.Long.Distance.Charges Multiple.Lines Internet.Service
##  Min.   : 0.00                     0:4072         0:1526          
##  1st Qu.: 9.21                     1:2971         1:5517          
##  Median :22.89                                                    
##  Mean   :22.96                                                    
##  3rd Qu.:36.40                                                    
##  Max.   :49.99                                                    
##                                                                   
##  Internet.Type Avg.Monthly.GB.Download Online.Security Online.Backup
##  1: 830        Min.   : 0.00           0:5024          0:4614       
##  2:1652        1st Qu.: 3.00           1:2019          1:2429       
##  3:3035        Median :17.00                                        
##  0:1526        Mean   :20.52                                        
##                3rd Qu.:27.00                                        
##                Max.   :85.00                                        
##                                                                     
##  Device.Protection.Plan Premium.Tech.Support Streaming.TV Streaming.Movies
##  0:4621                 0:4999               0:4336       0:4311          
##  1:2422                 1:2044               1:2707       1:2732          
##                                                                           
##                                                                           
##                                                                           
##                                                                           
##                                                                           
##  Streaming.Music Unlimited.Data Contract Paperless.Billing Payment.Method
##  0:4555          0:2298         1:3610   0:2872            1:3909        
##  1:2488          1:4745         2:1550   1:4171            2:2749        
##                                 3:1883                     3: 385        
##                                                                          
##                                                                          
##                                                                          
##                                                                          
##  Monthly.Charge   Total.Charges    Total.Refunds    Total.Extra.Data.Charges
##  Min.   :-10.00   Min.   :  18.8   Min.   : 0.000   Min.   :  0.000         
##  1st Qu.: 30.40   1st Qu.: 400.1   1st Qu.: 0.000   1st Qu.:  0.000         
##  Median : 70.05   Median :1394.5   Median : 0.000   Median :  0.000         
##  Mean   : 63.60   Mean   :2280.4   Mean   : 1.962   Mean   :  6.861         
##  3rd Qu.: 89.75   3rd Qu.:3786.6   3rd Qu.: 0.000   3rd Qu.:  0.000         
##  Max.   :118.75   Max.   :8684.8   Max.   :49.790   Max.   :150.000         
##                                                                             
##  Total.Long.Distance.Charges Total.Revenue      Customer.Status Churn.Category
##  Min.   :   0.00             Min.   :   21.36   Churned:1869    1: 314        
##  1st Qu.:  70.55             1st Qu.:  605.61   Joined : 454    2: 841        
##  Median : 401.44             Median : 2108.64   Stayed :4720    3: 321        
##  Mean   : 749.10             Mean   : 3034.38                   4: 211        
##  3rd Qu.:1191.10             3rd Qu.: 4801.15                   5: 182        
##  Max.   :3564.72             Max.   :11979.34                   0:5174        
##                                                                               
##                         Churn.Reason    Population    
##                               :5174   Min.   :    11  
##  Competitor had better devices: 313   1st Qu.:  2344  
##  Competitor made better offer : 311   Median : 17554  
##  Attitude of support person   : 220   Mean   : 22140  
##  Don't know                   : 130   3rd Qu.: 36125  
##  Competitor offered more data : 117   Max.   :105285  
##  (Other)                      : 778

Structure

## 'data.frame':    7043 obs. of  39 variables:
##  $ Zip.Code                         : int  90001 90001 90001 90001 90002 90002 90002 90002 90003 90003 ...
##  $ Customer.ID                      : Factor w/ 7043 levels "0002-ORFBO","0003-MKNFE",..: 5376 2308 86 2253 4701 3963 3129 3634 38 6795 ...
##  $ Gender                           : Factor w/ 2 levels "Female","Male": 1 2 2 1 2 2 2 2 1 1 ...
##  $ Age                              : int  36 36 71 66 31 46 56 68 60 49 ...
##  $ Married                          : Factor w/ 2 levels "No","Yes": 2 2 2 1 2 1 1 1 2 2 ...
##  $ Number.of.Dependents             : int  0 0 0 0 0 0 0 0 0 3 ...
##  $ City                             : Factor w/ 1106 levels "Acampo","Acton",..: 555 555 555 555 555 555 555 555 555 555 ...
##  $ Latitude                         : num  34 34 34 34 33.9 ...
##  $ Longitude                        : num  -118 -118 -118 -118 -118 ...
##  $ Number.of.Referrals              : int  0 1 1 0 9 0 0 0 4 2 ...
##  $ Tenure.in.Months                 : int  1 17 69 8 58 34 13 38 59 3 ...
##  $ Offer                            : Factor w/ 6 levels "0","1","2","3",..: 6 1 1 6 1 1 1 1 3 1 ...
##  $ Phone.Service                    : Factor w/ 2 levels "0","1": 1 1 2 2 2 2 2 2 2 2 ...
##  $ Avg.Monthly.Long.Distance.Charges: num  0 0 18.41 5.21 18.68 ...
##  $ Multiple.Lines                   : Factor w/ 2 levels "0","1": 1 1 2 1 1 1 1 2 2 1 ...
##  $ Internet.Service                 : Factor w/ 2 levels "0","1": 2 2 2 2 1 2 1 2 2 2 ...
##  $ Internet.Type                    : Factor w/ 4 levels "1","2","3","0": 2 1 3 3 4 2 4 3 3 3 ...
##  $ Avg.Monthly.GB.Download          : num  10 10 17 8 0 16 0 13 14 22 ...
##  $ Online.Security                  : Factor w/ 2 levels "0","1": 1 2 1 1 1 2 1 1 2 1 ...
##  $ Online.Backup                    : Factor w/ 2 levels "0","1": 2 1 2 1 1 1 1 1 2 2 ...
##  $ Device.Protection.Plan           : Factor w/ 2 levels "0","1": 1 2 2 2 1 2 1 1 1 1 ...
##  $ Premium.Tech.Support             : Factor w/ 2 levels "0","1": 1 1 2 1 1 1 1 1 1 2 ...
##  $ Streaming.TV                     : Factor w/ 2 levels "0","1": 1 1 2 2 1 1 1 2 2 1 ...
##  $ Streaming.Movies                 : Factor w/ 2 levels "0","1": 1 1 2 2 1 1 1 2 1 1 ...
##  $ Streaming.Music                  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Unlimited.Data                   : Factor w/ 2 levels "0","1": 2 2 2 2 1 2 1 2 2 2 ...
##  $ Contract                         : Factor w/ 3 levels "1","2","3": 1 1 3 1 3 2 1 3 1 1 ...
##  $ Paperless.Billing                : Factor w/ 2 levels "0","1": 2 1 2 2 1 1 2 1 2 2 ...
##  $ Payment.Method                   : Factor w/ 3 levels "1","2","3": 1 3 1 2 2 3 1 1 1 1 ...
##  $ Monthly.Charge                   : num  29.9 34.4 110 94.5 -4 ...
##  $ Total.Charges                    : num  29.9 592.8 7634.2 743 1186 ...
##  $ Total.Refunds                    : num  0 0 0 0 0 ...
##  $ Total.Extra.Data.Charges         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Total.Long.Distance.Charges      : num  0 0 1270.3 41.7 1083.4 ...
##  $ Total.Revenue                    : num  29.9 592.8 8904.5 784.6 2269.4 ...
##  $ Customer.Status                  : Factor w/ 3 levels "Churned","Joined",..: 2 3 3 1 3 3 3 3 3 2 ...
##  $ Churn.Category                   : Factor w/ 6 levels "1","2","3","4",..: 6 6 6 2 6 6 6 6 6 6 ...
##  $ Churn.Reason                     : Factor w/ 21 levels "","Attitude of service provider",..: 1 1 1 4 1 1 1 1 1 1 ...
##  $ Population                       : int  54492 54492 54492 54492 44586 44586 44586 44586 58198 58198 ...

Head

Zip.Code | Customer.ID | Gender | Age | Married | Number.of.Dependents | City | Latitude | Longitude | Number.of.Referrals | Tenure.in.Months | Offer | Phone.Service | Avg.Monthly.Long.Distance.Charges | Multiple.Lines | Internet.Service | Internet.Type | Avg.Monthly.GB.Download | Online.Security | Online.Backup | Device.Protection.Plan | Premium.Tech.Support | Streaming.TV | Streaming.Movies | Streaming.Music | Unlimited.Data | Contract | Paperless.Billing | Payment.Method | Monthly.Charge | Total.Charges | Total.Refunds | Total.Extra.Data.Charges | Total.Long.Distance.Charges | Total.Revenue | Customer.Status | Churn.Category | Churn.Reason | Population
90001 | 7590-VHVEG | Female | 36 | Yes | 0 | Los Angeles | 33.97362 | -118.249 | 0 | 1 | 5 | 0 | 0.00 | 0 | 1 | 2 | 10 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 29.85 | 29.85 | 0 | 0 | 0.00 | 29.85 | Joined | 0 |  | 54492
90001 | 3307-TLCUD | Male | 36 | Yes | 0 | Los Angeles | 33.97362 | -118.249 | 1 | 17 | 0 | 0 | 0.00 | 0 | 1 | 1 | 10 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 3 | 34.40 | 592.75 | 0 | 0 | 0.00 | 592.75 | Stayed | 0 |  | 54492
90001 | 0136-IFMYD | Male | 71 | Yes | 0 | Los Angeles | 33.97362 | -118.249 | 1 | 69 | 0 | 1 | 18.41 | 1 | 1 | 3 | 17 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 3 | 1 | 1 | 109.95 | 7634.25 | 0 | 0 | 1270.29 | 8904.54 | Stayed | 0 |  | 54492
90001 | 3217-FZDMN | Female | 66 | No | 0 | Los Angeles | 33.97362 | -118.249 | 0 | 8 | 5 | 1 | 5.21 | 0 | 1 | 3 | 8 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 2 | 94.45 | 742.95 | 0 | 0 | 41.68 | 784.63 | Churned | 2 | Competitor had better devices | 54492
90002 | 6625-FLENO | Male | 31 | Yes | 0 | Los Angeles | 33.94926 | -118.247 | 9 | 58 | 0 | 1 | 18.68 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 2 | -4.00 | 1185.95 | 0 | 0 | 1083.44 | 2269.39 | Stayed | 0 |  | 44586
90002 | 5575-GNVDE | Male | 46 | No | 0 | Los Angeles | 33.94926 | -118.247 | 0 | 34 | 0 | 1 | 17.09 | 0 | 1 | 2 | 16 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 3 | 56.95 | 1889.50 | 0 | 0 | 581.06 | 2470.56 | Stayed | 0 |  | 44586

7. Preliminary Data Analysis

Chi-square and Cramer’s V

Variable                statistic  p.value    V1  Cramers_Value
Contract                1441.3859  0.0000000  3   0.3823029
Device.Protection.Plan   156.9618  0.0000000  2   0.1261579
Internet.Service         309.0694  0.0000000  2   0.1770294
Internet.Type            532.7079  0.0000000  4   0.2324139
Offer                    795.4861  0.0000000  6   0.2840101
Online.Backup            138.4891  0.0000000  2   0.1185019
Online.Security          227.8757  0.0000000  2   0.1520080
Paperless.Billing        205.1861  0.0000000  2   0.1442419
Payment.Method           258.3974  0.0000000  3   0.1618682
Phone.Service              3.8192  0.1481396  2   0.0196790
Premium.Tech.Support     245.9676  0.0000000  2   0.1579270
Streaming.Movies         133.0671  0.0000000  2   0.1161590
Streaming.Music          102.2052  0.0000000  2   0.1018015
Streaming.TV             133.7242  0.0000000  2   0.1164455
Unlimited.Data           168.5999  0.0000000  2   0.1307513
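
For reference, a chi-square statistic and Cramér's V for a pair of categorical variables can be computed in base R roughly as follows. This sketch uses the textbook definition V = sqrt(chi2 / (n * min(r - 1, c - 1))); the helper that produced the table above is not shown in this report, so values computed this way may differ if another normalisation was used. The function name `cramers_v` and the toy vectors are assumptions made for illustration.

```r
# Hypothetical helper: chi-square test plus Cramér's V for two categorical vectors.
cramers_v <- function(x, y) {
  tab <- table(x, y)
  # correct = FALSE disables the Yates continuity correction for 2 x 2 tables.
  chi <- suppressWarnings(chisq.test(tab, correct = FALSE))
  n <- sum(tab)
  k <- min(nrow(tab), ncol(tab)) - 1
  stat <- as.numeric(chi$statistic)
  c(statistic = stat, p.value = chi$p.value, cramers_v = sqrt(stat / (n * k)))
}

# Perfect association: each level of x maps to exactly one level of y.
x_perfect <- rep(c("a", "b", "c"), each = 10)
res_perfect <- cramers_v(x_perfect, x_perfect)

# No association: every cell of the 2 x 2 table has the same count.
x_indep <- rep(c("a", "b"), each = 10)
y_indep <- rep(c("u", "v"), times = 10)
res_indep <- cramers_v(x_indep, y_indep)

print(res_perfect)
print(res_indep)
```

The two toy cases mark the ends of the scale: V is 1 under perfect association and 0 when the variables are independent in the sample.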

MANOVA

##                   Df  Pillai approx F num Df den Df    Pr(>F)    
## Customer.Status    2 0.45123    119.4     24   9836 < 2.2e-16 ***
## Residuals       4928                                             
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##  Response Age :
##                   Df  Sum Sq Mean Sq F value    Pr(>F)    
## Customer.Status    2   19843  9921.3    35.7 4.047e-16 ***
## Residuals       4928 1369541   277.9                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Number.of.Dependents :
##                   Df Sum Sq Mean Sq F value    Pr(>F)    
## Customer.Status    2  238.5 119.245  139.05 < 2.2e-16 ***
## Residuals       4928 4226.1   0.858                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Number.of.Referrals :
##                   Df Sum Sq Mean Sq F value    Pr(>F)    
## Customer.Status    2   4209 2104.49  264.36 < 2.2e-16 ***
## Residuals       4928  39231    7.96                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Tenure.in.Months :
##                   Df  Sum Sq Mean Sq F value    Pr(>F)    
## Customer.Status    2  848588  424294  982.02 < 2.2e-16 ***
## Residuals       4928 2129193     432                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Avg.Monthly.Long.Distance.Charges :
##                   Df  Sum Sq Mean Sq F value Pr(>F)
## Customer.Status    2     322  161.01  0.6742 0.5096
## Residuals       4928 1176863  238.81               
## 
##  Response Avg.Monthly.GB.Download :
##                   Df  Sum Sq Mean Sq F value    Pr(>F)    
## Customer.Status    2   15492  7746.2  18.794 7.395e-09 ***
## Residuals       4928 2031164   412.2                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Total.Extra.Data.Charges :
##                   Df  Sum Sq Mean Sq F value    Pr(>F)    
## Customer.Status    2    9543  4771.4  7.4154 0.0006086 ***
## Residuals       4928 3170873   643.4                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Total.Long.Distance.Charges :
##                   Df     Sum Sq   Mean Sq F value    Pr(>F)    
## Customer.Status    2  425524334 212762167  335.61 < 2.2e-16 ***
## Residuals       4928 3124150840    633959                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Total.Revenue :
##                   Df     Sum Sq    Mean Sq F value    Pr(>F)    
## Customer.Status    2 6.0482e+09 3024094638  432.22 < 2.2e-16 ***
## Residuals       4928 3.4480e+10    6996664                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Monthly.Charge :
##                   Df  Sum Sq Mean Sq F value    Pr(>F)    
## Customer.Status    2  292149  146074  159.66 < 2.2e-16 ***
## Residuals       4928 4508547     915                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Total.Charges :
##                   Df     Sum Sq    Mean Sq F value    Pr(>F)    
## Customer.Status    2 3.2710e+09 1635484274  363.77 < 2.2e-16 ***
## Residuals       4928 2.2156e+10    4495902                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Total.Refunds :
##                   Df Sum Sq Mean Sq F value    Pr(>F)    
## Customer.Status    2   1555  777.60   13.11 2.097e-06 ***
## Residuals       4928 292304   59.32                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
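
Output of this shape (an overall Pillai test followed by one ANOVA table per response) is what `summary()` and `summary.aov()` return for a `manova()` fit. The call that produced the results above is not shown, so the following is only a sketch of the pattern, run on R's built-in `iris` data instead of the customer data.

```r
# Fit a one-way MANOVA: three numeric responses, one grouping factor.
fit <- manova(cbind(Sepal.Length, Sepal.Width, Petal.Length) ~ Species,
              data = iris)

# Overall multivariate test; Pillai's trace is also the default statistic.
overall <- summary(fit, test = "Pillai")

# Follow-up univariate ANOVA for each response variable.
per_response <- summary.aov(fit)

print(overall)
print(per_response)
```

With the customer data, the responses would be the twelve numeric columns and the grouping factor would be Customer.Status.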

Correlation

Age Number.of.Dependents Number.of.Referrals Tenure.in.Months Avg.Monthly.Long.Distance.Charges Avg.Monthly.GB.Download Monthly.Charge Total.Charges Total.Refunds Total.Extra.Data.Charges Total.Long.Distance.Charges Total.Revenue
Age 1.0000000 -0.1233383 -0.0310285 0.0085981 -0.0135942 -0.3724461 0.1297730 0.0536561 0.0201893 0.0099749 -0.0056391 0.0408650
Number.of.Dependents -0.1233383 1.0000000 0.2865629 0.1132867 -0.0111846 0.1426338 -0.1217353 0.0249403 0.0120779 -0.0196748 0.0686656 0.0398695
Number.of.Referrals -0.0310285 0.2865629 1.0000000 0.3153105 0.0063733 0.0384084 0.0200994 0.2388944 0.0165633 -0.0052310 0.2115740 0.2517480
Tenure.in.Months 0.0085981 0.1132867 0.3153105 1.0000000 0.0083527 0.0518238 0.2379141 0.8251031 0.0622769 0.0868819 0.6719726 0.8530203
Avg.Monthly.Long.Distance.Charges -0.0135942 -0.0111846 0.0063733 0.0083527 1.0000000 -0.0261397 0.1306104 0.0638032 -0.0089866 0.0022397 0.5989450 0.2278390
Avg.Monthly.GB.Download -0.3724461 0.1426338 0.0384084 0.0518238 -0.0261397 1.0000000 0.3713364 0.2285263 0.0050984 0.0907955 0.0170669 0.1868528
Monthly.Charge 0.1297730 -0.1217353 0.0200994 0.2379141 0.1306104 0.3713364 1.0000000 0.6254960 0.0210074 0.1150761 0.2328548 0.5653195
Total.Charges 0.0536561 0.0249403 0.2388944 0.8251031 0.0638032 0.2285263 0.6254960 1.0000000 0.0403317 0.1197636 0.6039930 0.9717846
Total.Refunds 0.0201893 0.0120779 0.0165633 0.0622769 -0.0089866 0.0050984 0.0210074 0.0403317 1.0000000 0.0022219 0.0367774 0.0401572
Total.Extra.Data.Charges 0.0099749 -0.0196748 -0.0052310 0.0868819 0.0022397 0.0907955 0.1150761 0.1197636 0.0022219 1.0000000 0.0600591 0.1214895
Total.Long.Distance.Charges -0.0056391 0.0686656 0.2115740 0.6719726 0.5989450 0.0170669 0.2328548 0.6039930 0.0367774 0.0600591 1.0000000 0.7747940
Total.Revenue 0.0408650 0.0398695 0.2517480 0.8530203 0.2278390 0.1868528 0.5653195 0.9717846 0.0401572 0.1214895 0.7747940 1.0000000
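
The matrix shows several strongly related pairs, for example Total.Charges and Total.Revenue (0.97) and Tenure.in.Months with both of them (above 0.8). A sketch of how such pairs could be listed automatically is given here. The six rows are the Head rows shown earlier, used purely as a small example; the 0.9 threshold and the object names are arbitrary choices made for illustration.

```r
# Four numeric columns from the six Head rows shown earlier.
toy <- data.frame(
  Age              = c(36, 36, 71, 66, 31, 46),
  Tenure.in.Months = c(1, 17, 69, 8, 58, 34),
  Total.Charges    = c(29.85, 592.75, 7634.25, 742.95, 1185.95, 1889.50),
  Total.Revenue    = c(29.85, 592.75, 8904.54, 784.63, 2269.39, 2470.56)
)

# Pearson correlation matrix.
cor_mat <- cor(toy)

# Keep each pair once (upper triangle) when |r| exceeds the threshold.
threshold <- 0.9
high <- which(abs(cor_mat) > threshold & upper.tri(cor_mat), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  r    = cor_mat[high]
)

print(round(cor_mat, 3))
print(high_pairs)
```

Pairs flagged this way are candidates for multicollinearity checks before they enter the same regression model.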

8. Multinomial Logistic Regression

Model Summary

## Call:
## multinom(formula = Customer.Status ~ Age + Number.of.Dependents + 
##     Number.of.Referrals + Tenure.in.Months + Offer + Phone.Service + 
##     Avg.Monthly.Long.Distance.Charges + Multiple.Lines + Internet.Service + 
##     Internet.Type + Avg.Monthly.GB.Download + Online.Security + 
##     Online.Backup + Device.Protection.Plan + Premium.Tech.Support + 
##     Streaming.TV + Streaming.Movies + Streaming.Music + Unlimited.Data + 
##     Contract + Paperless.Billing + Payment.Method + Monthly.Charge + 
##     Total.Charges + Total.Refunds + Total.Long.Distance.Charges + 
##     Total.Revenue, data = traindf_pred2)
## 
## Coefficients:
##        (Intercept)         Age Number.of.Dependents Number.of.Referrals
## Joined   1.9603511 -0.03082223            0.3205773           0.3305210
## Stayed  -0.3169966 -0.01679600            0.5478115           0.2502277
##        Tenure.in.Months    Offer1     Offer2    Offer3     Offer4     Offer5
## Joined      -0.95900384 23.549183 10.1865275 0.5313735 -8.1349470 -0.3594837
## Stayed       0.09250368 -1.287368  0.1061509 0.2216949  0.7543357 -0.3631852
##        Phone.Service1 Avg.Monthly.Long.Distance.Charges Multiple.Lines1
## Joined      0.5212698                      -0.004649323      -0.5270837
## Stayed      0.6956961                       0.003546937      -0.1104598
##        Internet.Service1 Internet.Type2 Internet.Type3 Internet.Type0
## Joined         0.4653483      0.3908882     -0.6612113      1.4950027
## Stayed        -0.5381925      0.4673270      0.3413906      0.2211959
##        Avg.Monthly.GB.Download Online.Security1 Online.Backup1
## Joined            -0.002371780        0.5702509      0.6440871
## Stayed            -0.004288278        0.5487430      0.3263126
##        Device.Protection.Plan1 Premium.Tech.Support1 Streaming.TV1
## Joined              0.06356905             0.5755499    -0.2902495
## Stayed              0.14652193             0.5452480    -0.1623621
##        Streaming.Movies1 Streaming.Music1 Unlimited.Data1 Contract2 Contract3
## Joined         1.3570234       -1.7336383     -0.33594223  2.107196  3.630894
## Stayed         0.1798889       -0.1109168      0.02103502  1.351127  2.872753
##        Paperless.Billing1 Payment.Method2 Payment.Method3 Monthly.Charge
## Joined         -0.4922908       0.6859813      -0.5014792   -0.002071048
## Stayed         -0.2423102       0.4159249      -0.5312636   -0.011891290
##        Total.Charges Total.Refunds Total.Long.Distance.Charges Total.Revenue
## Joined  0.0031422737   0.009208220               -0.0010371218  0.0007649897
## Stayed -0.0003954708   0.006072633                0.0002451031 -0.0003175382
## 
## Std. Errors:
##        (Intercept)         Age Number.of.Dependents Number.of.Referrals
## Joined 0.001706426 0.003754653            0.0218358          0.04522931
## Stayed 0.002729437 0.002442088            0.0565553          0.02474783
##        Tenure.in.Months       Offer1       Offer2       Offer3       Offer4
## Joined      0.007100867 8.104668e-06 4.419837e-07 3.205531e-08 3.841388e-09
## Stayed      0.007684787 2.729151e-04 3.213364e-04 1.162494e-04 2.556609e-04
##              Offer5 Phone.Service1 Avg.Monthly.Long.Distance.Charges
## Joined 0.0006479004    0.001770351                       0.007799667
## Stayed 0.0010671972    0.003084193                       0.004258883
##        Multiple.Lines1 Internet.Service1 Internet.Type2 Internet.Type3
## Joined    0.0005638495       0.001570233   0.0005032142    0.001177688
## Stayed    0.0003208895       0.002722709   0.0005540282    0.003801231
##        Internet.Type0 Avg.Monthly.GB.Download Online.Security1 Online.Backup1
## Joined    0.003211291             0.004674395     0.0004801799   0.0005627926
## Stayed    0.005395436             0.002513387     0.0016159531   0.0010616302
##        Device.Protection.Plan1 Premium.Tech.Support1 Streaming.TV1
## Joined            0.0002420052          0.0004836781  0.0003142431
## Stayed            0.0009277510          0.0015734487  0.0007492829
##        Streaming.Movies1 Streaming.Music1 Unlimited.Data1    Contract2
## Joined      0.0004481977     0.0002423772     0.001871512 0.0002026849
## Stayed      0.0007094045     0.0007221277     0.003129928 0.0003659569
##           Contract3 Paperless.Billing1 Payment.Method2 Payment.Method3
## Joined 0.0002067809       0.0009340948     0.002852559    0.0002641255
## Stayed 0.0004743466       0.0031277831     0.005896004    0.0005684454
##        Monthly.Charge Total.Charges Total.Refunds Total.Long.Distance.Charges
## Joined    0.004396115   0.009263191   0.023705076                 0.009740034
## Stayed    0.002213205   0.001630616   0.006066285                 0.001645722
##        Total.Revenue
## Joined   0.009209219
## Stayed   0.001632669
## 
## Residual Deviance: 4044.068 
## AIC: 4184.068
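
`multinom()` treats the first factor level as the reference category, so with the levels Churned, Joined and Stayed the coefficients above are log-odds relative to Churned. The summary does not print test statistics; a common way to derive Wald z-values, p-values and odds ratios from the coefficient and standard-error matrices is sketched below. It runs on R's built-in `mtcars` data with an arbitrary small formula, not on `traindf_pred2`, and assumes the `nnet` package is installed.

```r
library(nnet)

# Small illustrative model: number of gears (3, 4 or 5) as the outcome.
fit <- multinom(factor(gear) ~ mpg + wt, data = mtcars, trace = FALSE)
s <- summary(fit)

# Wald z-statistics: coefficient divided by its standard error.
z <- s$coefficients / s$standard.errors

# Two-sided p-values from the standard normal distribution.
p <- 2 * (1 - pnorm(abs(z)))

# Odds ratios relative to the reference level (here gear = 3).
odds_ratios <- exp(coef(fit))

print(round(z, 3))
print(round(p, 4))
print(round(odds_ratios, 3))
```

Applied to the model above, very small standard errors next to very large coefficients (as for Offer1 and Offer2 in the Joined row) would yield extreme z-values and are worth checking for separation or convergence problems before interpreting them.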